Reducing the Number of Queries in Interactive Value Iteration
ثبت نشده
چکیده
To tackle the potentially hard task of defining the reward function in a Markov Decision Process (MDPs), a new approach, called Interactive Value Iteration (IVI) has recently been proposed by Weng and Zanuttini (2013). This solving method, which interweaves elicitation and optimization phases, computes a (near) optimal policy without knowing the precise reward values. The procedure as originally presented can be improved in order to reduce the number of queries needed to determine an optimal policy. The key insights are that 1) asking queries should be delayed as much as possible, avoiding asking queries that might not be necessary to determine the best policy, 2) queries should be asked by following a priority order because the answers to some queries can enable to resolve some other queries, 3) queries can be avoided by using heuristic information to guide the process. Following these ideas, a modified IVI algorithm is presented and experimental results show a significant decrease in the number of queries issued.
منابع مشابه
Reducing the Number of Queries in Interactive Value Iteration
To tackle the potentially hard task of defining the reward function in a Markov Decision Process (MDPs), a new approach, called Interactive Value Iteration (IVI) has recently been proposed by Weng and Zanuttini (2013). This solving method, which interweaves elicitation and optimization phases, computes a (near) optimal policy without knowing the precise reward values. The procedure as originall...
متن کاملAIDE: An Automated Sample-based Approach for Interactive Data Exploration
In this paper, we argue that database systems be augmented with an automated data exploration service that methodically steers users through the data in a meaningful way. Such an automated system is crucial for deriving insights from complex datasets found in many big data applications such as scientific and healthcare applications as well as for reducing the human effort of data exploration. T...
متن کاملAnalysis of users’ query reformulation behavior in Web with regard to Wholis-tic/analytic cognitive styles, Web experience, and search task type
Background and Aim: The basic aim of the present study is to investigate users’ query reformulation behavior with regard to wholistic-analytic cognitive styles, search task type, and experience variables in using the Web. Method: This study is an applied research using survey method. A total of 321 search queries were submitted by 44 users. Data collection tools were Riding’s Cognitive Style A...
متن کاملThe Impact of the Objective Complexity and Product of Work Task on Interactive Information Searching Behavior
Background and Aim: this study aimed to explore the impact of objective complexity and Product of work task on user's interactive information searching behavior. Method: The research population consisted of MSc students of Ferdowsi university of Mashhad enrolled in 2012-13 academic year. In 3 stages of sampling (random stratified, quota, and voluntary sampling), 30 cases were selected. Each of ...
متن کاملInteractive Value Iteration for Markov Decision Processes with Unknown Rewards
To tackle the potentially hard task of defining the reward function in a Markov Decision Process, we propose a new approach, based on Value Iteration, which interweaves the elicitation and optimization phases. We assume that rewards whose numeric values are unknown can only be ordered, and that a tutor is present to help comparing sequences of rewards. We first show how the set of possible rewa...
متن کامل